On the combinatorics of the 2-class classification problem
نویسندگان
چکیده
A set of points X = XB ∪ XR ⊆ R is linearly separable if the convex hulls of XB and XR are disjoint, hence there exists a hyperplane separating XB from XR. Such a hyperplane provides a method for classifying new points, according to which side of the hyperplane the new points lie. When such a linear separation is not possible, it may still be possible to partition XB and XR into prespecified numbers of groups, in such a way that every group from XB is linearly separable from every group from XR. We may also discard some points as outliers, and seek to minimize the number of outliers necessary to find such a partition. Based on these ideas, Bertsimas and Shioda proposed the classification and regression by integer optimization (CRIO) method in 2007. In this work we explore the integer programming aspects of the classification part of CRIO, in particular theoretical properties of the associated formulation. We are able to find facet-inducing inequalities coming from the stable set polytope, hence showing that this classification problem has exploitable combinatorial properties.
منابع مشابه
A Novel One Sided Feature Selection Method for Imbalanced Text Classification
The imbalance data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. The classification algorithms have more tendencies to the large class and might even deal with the minority class data as the outlier data. The text data is one of t...
متن کاملModified CLPSO-based fuzzy classification System: Color Image Segmentation
Fuzzy segmentation is an effective way of segmenting out objects in images containing both random noise and varying illumination. In this paper, a modified method based on the Comprehensive Learning Particle Swarm Optimization (CLPSO) is proposed for pixel classification in HSI color space by selecting a fuzzy classification system with minimum number of fuzzy rules and minimum number of incorr...
متن کاملFault diagnosis in a distillation column using a support vector machine based classifier
Fault diagnosis has always been an essential aspect of control system design. This is necessary due to the growing demand for increased performance and safety of industrial systems is discussed. Support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...
متن کاملImproving Chernoff criterion for classification by using the filled function
Linear discriminant analysis is a well-known matrix-based dimensionality reduction method. It is a supervised feature extraction method used in two-class classification problems. However, it is incapable of dealing with data in which classes have unequal covariance matrices. Taking this issue, the Chernoff distance is an appropriate criterion to measure distances between distributions. In the p...
متن کاملNegative Selection Based Data Classification with Flexible Boundaries
One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...
متن کاملImproving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering
Classification is an one of the important parts of data mining and knowledge discovery. In most cases, the data that is utilized to used to training the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples but while the number of other class samples is naturally inherently low. In general, the methods of solving this kind of prob...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1706.06214 شماره
صفحات -
تاریخ انتشار 2017